Abstract — The Gaussian broadcast channel (GBC) with K 
transmit antennas and K single-antenna users is considered for 
the case in which the channel state information is obtained at 
the transmitter via a finite-rate feedback link of capacity r bits 
per user. The throughput (i.e., the sum-rate normalized by K) 
of the GBC is analyzed in the limit as $K \to \infty$ with $r/K \to \bar{r}$. 
Considering the transmission strategy of zeroforcing dirty paper 
coding (ZFDPC), a closed-form expression for the asymptotic 
throughput is derived. It is observed that, even under the finite- 
rate feedback setting, ZFDPC achieves a significantly higher 
throughput than zeroforcing beamforming. Using the asymptotic 
throughput expression, the problem of obtaining the number of 
users to be selected in order to maximize the throughput is solved. 

Index Terms — broadcast channel, dirty paper coding, inflation 
factor, zeroforcing beamforming. 

I. Introduction 

THE Gaussian broadcast channel (GBC) has been intensely 
researched in recent years. It is well known that 
dirty paper coding (DPC) [1] achieves the capacity region 
of the GBC if perfect channel state information (CSI) is 
available at the transmitter (CSIT) and the receivers (CSIR) 
[2]. However, even though the assumption of perfect CSIR can 
be justified, it is unrealistic to assume the same about CSIT. 
Moreover, the rate achievable over the GBC is quite sensitive 
to the quality of CSIT, as has been demonstrated in [3]-[5] 
(see also references in [5]). This paper therefore tackles the 
important problem of achieving high throughputs using DPC 
over the GBC with imperfect CSIT. 

It is well known that under perfect CSIT the DPC based 
transmission outperforms other known strategies such as 
zeroforcing beamforming (ZFBF) [3]. Nevertheless, under 
imperfect CSIT, it is the ZFBF strategy that has been intensely 
researched rather than DPC, mainly because ZFBF is 
analytically tractable [4], [6] and because of a perception that DPC 
based schemes are either not feasible without perfect CSIT 
or, even if feasible, they may be analytically intractable. The 
main hurdle with DPC is seen to be the difficulty of designing 
the inflation factor - a parameter that can critically affect its 
performance [1] - without perfect CSIT; it is even generally 
believed that the inflation factor cannot be effectively designed 
without perfect CSIT O implying that DPC may be overly 
sensitive to the imperfection in CSIT, thereby rendering it less 
desirable than even ZFBF. 

Making progress to this end, we recently developed iterative 
numerical algorithms for the determination of the inflation factor 
under imperfect CSIT which yield high achievable rates [7], 
[8]. Some analytical results were also obtained in the high/low 
SNR (signal-to-noise ratio) regime [9], [10]. However, these 
results do not always reveal how DPC works with imperfect 
CSIT, nor do they shed light on the behavior 
of DPC at moderate values of SNR. Moreover, due to the 
numerical nature of these algorithms, it is almost impossible to 
derive analytical results regarding the finite SNR performance 
of DPC based strategies or on how they compare with other 
transmission strategies such as ZFBF. Furthermore, the 
algorithms do not lend themselves to answering important design 
questions about DPC based schemes - such as optimizing 
the sum-rate by selecting (and transmitting to) only a subset 
of users - other than through a tedious and uninsightful 
exhaustive search. Recall that the strategy of transmitting to 
a subset of users is known to result in a considerable 
improvement in the sum-rate under perfect CSIT [3] and, for 
ZFBF, even under imperfect CSIT [6]. 

To address the above issues, we undertake here a large-system 
or asymptotic analysis of the GBC with K transmit 
antennas and K single-antenna users (i.e., the GBC of 
size/dimension K) in which the CSIT is obtained via a finite-rate 
feedback link of capacity r bits per user per channel 
realization (or coherence interval). In particular, for the 
transmission strategy of zeroforcing DPC (ZFDPC) [3], [9], we 
analyze the normalized sum-rate or the throughput (i.e., the 
sum-rate divided by K) of the GBC in the limit as $K \to \infty$ 
with $r/K \to \bar{r}$. Such a problem has been considered before in 
the special case of perfect CSIT (i.e., $\bar{r} = \infty$) in [3] and for 
the finite-rate feedback GBC with the simpler transmission 
strategy of ZFBF in [6]. 

In the large-system limit, the involved random variables 
converge to their deterministic limits [11]. Therefore, the 
large-system analysis yields a closed-form expression for the 
asymptotic throughput (i.e., the throughput in the limit of 
$K \to \infty$), the evaluation of which involves a simple, 
easy-to-compute numerical integral. Importantly, unlike many works 
that deal with high SNR characterizations (c.f., and the 
references therein) the asymptotic throughput is obtained as 
a function of SNR, and hence, it can provide insights at 
any finite SNR. It also serves as a simple semi-analytic tool 
for the comparison of different transmission strategies. In 
particular, contrary to popular belief, we show that even under 
imperfect CSIT, ZFDPC does indeed achieve a significantly 
higher throughput than ZFBF. Furthermore, the asymptotic 
analysis helps to definitively answer the design question of 
optimizing over the number of users to be transmitted to. It 
is also seen that this method, when mapped simply to finite 
dimensions, works quite accurately even for relatively small 
values of K. Thus the asymptotic analysis is seen to offer 
useful insights about finite-dimensional GBCs as well. 

Notations: For a matrix/vector A, $A^*$ is its complex-conjugate 
transpose. $\mathcal{CN}(0,1)$ denotes the circularly symmetric 
complex normal mean-0 variance-1 random variable (RV), 
while $\chi^2_{2K}$ denotes the chi-square RV of mean K. $h \sim \mathcal{CN}(K)$ 
denotes the vector h of dimension K consisting of independent 
$\mathcal{CN}(0,1)$ RVs. For any vector a, $\tilde{a}$ denotes its direction, i.e., 
$\tilde{a} = a/\|a\|$, where $\|a\|$ denotes the norm of a. For a vector $\hat{h}_i$, 
the perpendicular space of it and the orthonormal basis vectors 
spanning the perpendicular space, both, are denoted by $p_i$ (the 
meaning is to be understood from the context). Almost-sure 
convergence [12] is denoted by a.s. $I_K \in \mathbb{C}^{K \times K}$ is an identity 
matrix. For RVs A and B, $A \perp B$ denotes independence. All 
logarithms are to base 2. 

II. System Model of GBC 



Consider the GBC of size K. The received signal at the i-th 
user is given by $y_i = h_i^* x + z_i$, where $h_i^* \in \mathbb{C}^{1 \times K}$ is the 
channel vector of the i-th user, $x \in \mathbb{C}^{K \times 1}$ is the signal transmitted 
under the power constraint of P, and $z_i \sim \mathcal{CN}(0,1)$ is the 
additive noise. We assume that $h_i \sim \mathcal{CN}(K)$ are independent. 
Let $\omega_i \in \mathbb{C}^{K \times 1}$ denote the BF vector for the i-th user and let 
$\|\omega_i\| = 1$. Let $u_i$ be the data symbol to be sent to the user 
i ($u_i$'s are independent). Then the total transmitted signal is 
given by $x = \sum_{i=1}^{K} \omega_i u_i$. Let $H = [h_1\ h_2\ h_3 \cdots h_K]$. We 
assume perfect CSIR. Define SNR = P. 

We consider the so-called 'on-off' power allocation policy, 
in which the transmitter selects a set, denoted $A_{on}$, of 'on' 
users and transmits with equal power to the selected users. In 
fact, under perfect CSIT, such a scheme is near optimal. To 
be precise, the difference between the asymptotic throughput 
achieved with the optimal waterfilling-type power allocation 
policy [3] and that obtained using the on-off power policy 
is negligible. Hence, we consider here only the on-off power 
policy. Under this scheme, if $i \in A_{on}$ then $u_i \sim \mathcal{CN}(0, P/s)$, 
where $s := |A_{on}|$, else $u_i = 0$. We let $s/K \to \bar{s}$ as $K \to \infty$. 
Thus, the asymptotic throughput is obtained as a function of $\bar{s}$, 
which allows us to answer the design problem of optimization 
over the fraction of users. 



A. Quantization Scheme 

In the limit of large K, the RVs $\max_{1 \le i \le K} \frac{1}{K}\|h_i\|^2$ and 
$\min_{1 \le i \le K} \frac{1}{K}\|h_i\|^2$ both converge to 1 in probability [6, 
Proposition 1]. Therefore, we find it sufficient, for the present 
purpose, to feed back only the channel directions, namely the 
$\tilde{h}_i$'s. Let r denote the number of feedback bits per user. 
Each user has a codebook $C = \{q_j\}_{j=1}^{2^r}$ consisting of $2^r$ 
K-dimensional unit-norm vectors. The vector $\tilde{h}_i$ is quantized 
according to the rule: $\hat{h}_i = \arg\min_{q_j \in C} \sin^2(\angle(\tilde{h}_i, q_j))$. 
Denote by $d_C^2(i)$ (or simply $d^2$) the quantization error 
$\sin^2(\angle(\tilde{h}_i, \hat{h}_i))$. We further assume that $r/K \to \bar{r}$. We define 
$\hat{H} = [\hat{h}_1\ \hat{h}_2\ \hat{h}_3 \cdots \hat{h}_K]$. 
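
The quantization rule above can be illustrated with a short sketch. It assumes a randomly drawn codebook of isotropic unit vectors, which is only a stand-in for illustration (the analysis in this paper uses the quantization-cell approximation rather than any specific codebook); the helper names `random_unit_vector`, `sin2_angle` and `quantize_direction`, and the sizes K = 4 and r = 6, are hypothetical.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Draw an isotropic unit-norm complex vector of length dim."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]

def sin2_angle(a, b):
    """sin^2 of the angle between unit vectors a and b, i.e. 1 - |a^* b|^2."""
    inner = sum(x.conjugate() * y for x, y in zip(a, b))
    return max(0.0, 1.0 - abs(inner) ** 2)

def quantize_direction(direction, codebook):
    """Return (index, error) of the codeword minimising sin^2 of the angle."""
    errors = [sin2_angle(direction, q) for q in codebook]
    best = min(range(len(codebook)), key=errors.__getitem__)
    return best, errors[best]

rng = random.Random(0)
K, r = 4, 6  # illustrative sizes only (assumption)
codebook = [random_unit_vector(K, rng) for _ in range(2 ** r)]  # 2^r codewords
h_dir = random_unit_vector(K, rng)  # channel direction of one user
index, d2 = quantize_direction(h_dir, codebook)
print(f"fed-back index = {index}, quantization error d^2 = {d2:.4f}")
```

Only the r-bit index is fed back; the transmitter looks up the codeword with that index and uses it as the quantized direction.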

In this paper, we assume the quantization cell approximation 
(called the quantization-cell upper-bound (QUB)) [13]. Under 
this approximation, we assume ideally that the quantization 
cell around each vector of the codebook is a spherical cap of 
area $2^{-r}$ times the total surface area of the unit sphere [13]. 
It has been shown that this approximation yields an upper 
bound to the performance [13], [14], and this upper bound 
has been observed to be tight [14]. The tightness of the bound 
is a property that depends mainly on how the quantization 
error is modeled under the approximation, and not so much 
on the channel or the transmission scheme for which the 
approximation is being used. Hence, QUB is a reasonable 
assumption to make for the present analysis. 

Lemma 1: If we write $\tilde{h}_i = \sqrt{1 - d^2(i)}\,\hat{h}_i + \sqrt{d^2(i)}\,e_i$ then, 
under the QUB model, $e_i$ is isotropically distributed in $p_i$ and 
is independent of $d^2(i)$. Also, $d^2(i) \to 2^{-\bar{r}} =: D$ a.s. 
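
The limiting distortion in Lemma 1 can be checked numerically with a small sketch. It assumes the spherical-cap form of the QUB that is common in the limited-feedback literature, namely that the quantization error has CDF $2^r x^{K-1}$ on $[0, 2^{-r/(K-1)}]$; this CDF is not stated explicitly in this section, and the function name `qub_mean_distortion` is hypothetical.

```python
def qub_mean_distortion(K, r):
    """Mean of d^2 when d^2 has CDF 2^r * x^(K-1) on [0, 2^(-r/(K-1))].

    Integrating x against the density 2^r (K-1) x^(K-2) over that interval
    gives (K-1)/K * 2^(-r/(K-1)).
    """
    delta = 2.0 ** (-r / (K - 1))  # largest possible error under the cap model
    return (K - 1) / K * delta

r_bar = 2.0  # feedback bits per user per antenna, i.e. the limit of r/K
for K in (4, 16, 64, 256, 1024):
    r = r_bar * K
    print(K, qub_mean_distortion(K, r), 2.0 ** (-r_bar))
```

As K grows with r/K held fixed, the printed mean approaches the limit stated in the lemma, two to the power of minus the normalized feedback rate.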

We consider here an ensemble of codebooks {UC} where 
U is a Haar distributed unitary matrix and C is a given 
codebook. Then $\hat{h}_i$ is isotropic. We take the expectation over 
the codebooks as well, although this is not shown explicitly in 
subsequent formulas. 

B. The Achievable Rate 

Assume that the users are encoded according to their natural 
order. We focus here on the ZFDPC scheme [3], [9]. To obtain the 
BF vectors under this scheme, we perform the QR-decomposition 
[13] of H when there is perfect CSIT [3]; i.e., let H = QR. 
The columns of Q are respectively the BF vectors of the users. 
Under finite-rate feedback, we perform the same procedure 
with $\hat{H}$ [9]. Note that the BF vectors are orthogonal and (under 
finite-rate feedback) $\hat{h}_i^* \omega_j = 0,\ \forall j > i$. 
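
The construction of the BF vectors can be sketched as follows. This is a minimal illustration that obtains the Q factor by modified Gram-Schmidt on the columns of the quantized channel matrix; the function names `inner` and `zfdpc_beamformers`, the random test matrix and the size K = 4 are assumptions made for the example, and any numerically stable QR routine could be used instead.

```python
import math
import random

def inner(a, b):
    """Complex inner product a^* b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def zfdpc_beamformers(columns):
    """Modified Gram-Schmidt on the given columns.

    The returned orthonormal vectors are the columns of Q in H_hat = QR,
    taken in the users' encoding order.
    """
    basis = []
    for h in columns:
        v = list(h)
        for q in basis:
            c = inner(q, v)  # component of v along q
            v = [x - c * y for x, y in zip(v, q)]
        norm = math.sqrt(inner(v, v).real)
        basis.append([x / norm for x in v])
    return basis

rng = random.Random(1)
K = 4  # illustrative size only (assumption)
H_hat = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(K)]
         for _ in range(K)]  # H_hat[i] plays the role of the i-th quantized vector
omega = zfdpc_beamformers(H_hat)

# Interference from users encoded later (j > i) is zero-forced at user i.
leak = max(abs(inner(H_hat[i], omega[j]))
           for i in range(K) for j in range(i + 1, K))
print(f"largest |h_hat_i^* omega_j| for j > i: {leak:.2e}")
```

The zero-forcing property follows because each quantized vector lies in the span of the first i beamformers, so it is orthogonal to all later ones.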

We select the auxiliary random variable for the i-th user 
as $U_i = u_i + W_i [u_1^* \cdots u_{i-1}^*]^*$, where $W_i \in \mathbb{C}^{1 \times (i-1)}$ is the 
inflation factor for the i-th user [7], [8], [1]. Then the achievable 
rate for the i-th user is given by equation (1) below (see [8]). 

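In this scheme, the interference (at user i) due to the users 
encoded previously (i.e., users 1 to i - 1) is treated by DPC, 
whereas that due to the users encoded afterwards (i.e., users 
i + 1 to s) is treated by zeroforcing; hence the name ZFDPC. 

$$R_i = E_{\hat{H}} \max_{W_i} E_{H|\hat{H}} \log \frac{\mathrm{nr}}{\mathrm{nr} \cdot (1 + \|W_i\|^2) - \frac{P}{s}\,\big|h_i^*(\omega_i + [\omega_1 \cdots \omega_{i-1}] W_i^*)\big|^2}, \quad \text{where } \mathrm{nr} = 1 + \frac{P}{s} \sum_{j=1}^{s} |h_i^* \omega_j|^2 \qquad (1)$$

$$W_i = \frac{P}{s}\, E_{H|\hat{H}}\{\omega_i^* h_i h_i^* [\omega_1 \cdots \omega_{i-1}]\}\, M_i^{-1}, \quad \text{where } M_i = E_{H|\hat{H}}\,\mathrm{nr} \cdot I_{i-1} - \frac{P}{s}\, [\omega_1 \cdots \omega_{i-1}]^*\, E_{H|\hat{H}}(h_i h_i^*)\, [\omega_1 \cdots \omega_{i-1}] \qquad (2)$$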

Our choice of the inflation factor is stated in equation (2), 
which is obtained from [8] and [9]. It is derived by first moving 
the conditional expectation in equation (1) inside the logarithm 
to obtain an upper bound on the rate and then maximizing this 
upper bound over the inflation factor. 

III. Evaluation of the Asymptotic Throughput 

We want to evaluate the limit $\lim_K \frac{1}{K} \sum_{i=1}^{s} R_i$, where each 
$R_i$ depends on the user index i, and hence, on the normalized 
user index $i/s$. In the limit, the normalized user index would 
take a value from the continuum, i.e., $\bar{i} := \lim_K i/s \in [0, 1]$. We 
thus anticipate that, as $K \to \infty$, the above summation would 
converge to an integral over $\bar{i}$, the integrand of which would 
be the limit of $R_i$. 

To compute the limit, we first need to evaluate the conditional 
expectations in (2). We start below with $E_{H|\hat{H}}(\tilde{h}_i \tilde{h}_i^*)$. 
Then we compute $E_{H|\hat{H}}\,\mathrm{nr}$ and $\lim_K \mathrm{nr}$. To this end, note that 

$$\mathrm{nr} = 1 + \frac{P}{s}|h_i^* \omega_i|^2 + \frac{P}{s} \sum_{j<i} |h_i^* \omega_j|^2 + \frac{P}{s} \sum_{k>i} |h_i^* \omega_k|^2;$$

each of these terms is dealt with separately in Subsections III-A to III-C, 
respectively. Later, in Subsection III-D, we find $W_i$ in closed 
form. Finally, we evaluate the limit of the terms involved in 
the denominator of the argument of the logarithm in (1). We 
define $D_K := E_{H,\hat{H}}\, d^2(i)$. Then $\lim_K D_K = D = 2^{-\bar{r}}$. 

Lemma 2: $E_{H|\hat{H}}(\tilde{h}_i \tilde{h}_i^*) = \big(1 - D_K - \frac{D_K}{K-1}\big)\hat{h}_i \hat{h}_i^* + \frac{D_K}{K-1} I_K$. 
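Proof: We first prove that $E_{H|\hat{H}}(e_i) = 0$. To this end, consider 
the unitary matrix 

$$U = [\hat{h}_i\ \ p_i] \begin{bmatrix} 1 & 0 \\ 0 & V \end{bmatrix} \begin{bmatrix} \hat{h}_i^* \\ p_i^* \end{bmatrix},$$

where $V \in \mathbb{C}^{(K-1) \times (K-1)}$ is any arbitrary unitary matrix. Now, $U\hat{h}_i = \hat{h}_i$ 
and $U p_i = p_i V$. Hence, $U e_i$ is isotropic in $p_i$, which implies 
that $U E_{H|\hat{H}}(e_i) = E_{H|\hat{H}}(e_i) = 0$. 

Now, $E_{H|\hat{H}}(e_i e_i^*)$ must be of the form $p_i Q p_i^*$ where 
Q is a positive semi-definite matrix. We can prove that 
$U E_{H|\hat{H}}(e_i e_i^*) U^* = E_{H|\hat{H}}(e_i e_i^*) \Rightarrow V Q V^* = Q$, $\forall$ unitary 
V. This implies that $Q = k \cdot I_{K-1}$ with k chosen such that 
tr(Q) = 1. Hence $E_{H|\hat{H}}(e_i e_i^*) = p_i \frac{1}{K-1} I_{K-1}\, p_i^*$. 

Now using the decomposition of $\tilde{h}_i$ given in Lemma 1 and 
noting the fact that $p_i p_i^* = I_K - \hat{h}_i \hat{h}_i^*$ we obtain the result. ■ 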



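A. Analysis of $\frac{1}{s}|h_i^* \omega_i|^2$ 

$$\frac{1}{s}|h_i^* \omega_i|^2 = \frac{1}{s}\big|h_i^* [\hat{h}_i\ p_i][\hat{h}_i\ p_i]^* \omega_i\big|^2 = \frac{1}{s}|h_i^* \hat{h}_i \hat{h}_i^* \omega_i|^2 + \frac{1}{s}|h_i^* p_i p_i^* \omega_i|^2 + \text{cross terms}.$$

1) $\frac{1}{s}|h_i^* \hat{h}_i \hat{h}_i^* \omega_i|^2 = \frac{1}{s}\|h_i\|^2 (1 - d^2(i)) |\hat{h}_i^* \omega_i|^2$. Therefore, 
$E_{H|\hat{H}}\, \frac{1}{s}|h_i^* \hat{h}_i \hat{h}_i^* \omega_i|^2 = \frac{K}{s}(1 - D_K)|\hat{h}_i^* \omega_i|^2$, where $D_K = E\,d^2(i)$. 
We know $\frac{1}{K}\|h_i\|^2 \to 1$ and $d^2(i) \to D$ a.s. Note that $|\hat{h}_i^* \omega_i|^2$ 
behaves like $\frac{1}{K}\chi^2_{2(K-i+1)}$ in the limit [11]. Hence, 
$|\hat{h}_i^* \omega_i|^2 \to (1 - \bar{i}\bar{s})$ a.s. Therefore $\frac{1}{s}|h_i^* \hat{h}_i \hat{h}_i^* \omega_i|^2 \to \frac{1}{\bar{s}}(1 - D)(1 - \bar{i}\bar{s})$ a.s. 

2) Now $p_i p_i^* \tilde{h}_i = \sqrt{d^2(i)}\, e_i$ with $E_{H|\hat{H}}\, d^2(i) = D_K$. Let us consider 
$h' \sim \mathcal{CN}(K-1)$, through which the isotropic direction $e_i$ in $p_i$ can be represented. 
Let $a_i = p_i p_i^* \omega_i$; $\|a_i\|^2 = 1 - |\hat{h}_i^* \omega_i|^2$. 
Conditioned on $\hat{H}$, $a_i \in p_i$ is a deterministic direction. Hence, 
conditioned on $\hat{H}$, the projection of $h'$ on the direction of $a_i$ is $\sim \mathcal{CN}(0, 1)$. Therefore, 

$$E_{H|\hat{H}}\, |e_i^* \omega_i|^2 = \frac{1 - |\hat{h}_i^* \omega_i|^2}{K-1} \quad \text{and} \quad E_{H|\hat{H}}\, \frac{1}{s}|h_i^* p_i p_i^* \omega_i|^2 = \frac{D_K K}{s(K-1)}\big(1 - |\hat{h}_i^* \omega_i|^2\big).$$

To compute the limit, note that $a_i^* e_i$ behaves like $\mathcal{CN}(0,1)/\sqrt{K-1}$, 
as far as the limit is concerned. Since the other multiplicative 
terms remain bounded in the limit, this term converges to zero a.s. 

3) One of the cross terms is $\|h_i\|^2\, \tilde{h}_i^* \hat{h}_i \hat{h}_i^* \omega_i \omega_i^* p_i p_i^* \tilde{h}_i$. Now, 
$\tilde{h}_i^* \hat{h}_i = \sqrt{1 - d^2(i)}$; $p_i p_i^* \tilde{h}_i = \sqrt{d^2(i)}\, e_i$. Since $d(i) \perp e_i$ and 
$E_{H|\hat{H}}\, e_i = 0$, the conditional expectation of the cross terms is 
zero. Their limit can also be shown to be zero a.s. 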

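B. Analysis of $\frac{1}{s}\sum_{j<i} |h_i^* \omega_j|^2$ 

The conditional expectation can be computed using the 
techniques developed in Subsection III-A. We directly state 
the main result. Note that $|\hat{h}_i^* \omega_i|^2 = 1 - \sum_{j<i} |\hat{h}_i^* \omega_j|^2$ and 
$|h_i^* \omega_j|^2 = |h_i^* \hat{h}_i \hat{h}_i^* \omega_j|^2 + |h_i^* p_i p_i^* \omega_j|^2 +$ cross terms. Then 

$$E_{H|\hat{H}} \sum_{j<i} |h_i^* \hat{h}_i \hat{h}_i^* \omega_j|^2 = K(1 - D_K)\big(1 - |\hat{h}_i^* \omega_i|^2\big),$$

$$E_{H|\hat{H}} \sum_{j<i} |h_i^* p_i p_i^* \omega_j|^2 = \frac{D_K K}{K-1}\big(i - 2 + |\hat{h}_i^* \omega_i|^2\big).$$

To compute the limit, we have $\omega_j \perp h_i\ \forall j < i$. Thus, 
$h_i^* \omega_j$ are independent $\sim \mathcal{CN}(0, 1)$ random variables. Hence 
$\frac{1}{s}\sum_{j<i} |h_i^* \omega_j|^2 \to \bar{i}$ a.s. 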

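C. Analysis of $\frac{1}{s}\sum_{k>i} |h_i^* \omega_k|^2$ 

To compute the conditional expectation, the techniques 
developed in Subsection III-A are used. We omit the details 
and state the main result: $\frac{1}{s} E_{H|\hat{H}} \sum_{k>i} |h_i^* \omega_k|^2 = \frac{K}{K-1} D_K \frac{s-i}{s}$. 

Since $\omega_k \in p_i$, we get $|h_i^* \omega_k|^2 = \|h_i\|^2 \cdot d^2(i) \cdot |e_i^* \omega_k|^2$. 
As in Subsection III-A, introduce $h' \sim \mathcal{CN}(K-1)$ to represent $e_i$. 
Then, conditioned on $\hat{H}$, the sum over $k > i$ of the squared 
projections of $h'$ on the directions $\omega_k$ is $\sim \chi^2_{2(s-i)}$; hence 
unconditionally also, it has the same distribution. This gives 
$\frac{1}{s}\sum_{k>i} |h_i^* \omega_k|^2 \to D(1 - \bar{i})$ a.s. 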

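D. Computation of the inflation factor 

Let us define $l_i^* = \hat{h}_i^* [\omega_1\ \omega_2 \cdots \omega_{i-1}]$. Then, with $D_K = E\,d^2(i)$, we have: 

$$E_{H|\hat{H}}\big(\omega_i^* h_i h_i^* [\omega_1 \cdots \omega_{i-1}]\big) = K\Big(1 - D_K - \frac{D_K}{K-1}\Big)(\omega_i^* \hat{h}_i)\, l_i^*,$$

$$E_{H|\hat{H}}\,\mathrm{nr} = 1 + \frac{PK}{s}(1 - D_K) - \frac{P D_K K}{s(K-1)} + \frac{P D_K K}{K-1}, \quad \text{and}$$

$$M_i = \Big(E_{H|\hat{H}}\,\mathrm{nr} - \frac{P D_K K}{s(K-1)}\Big) I_{i-1} - \frac{PK}{s}\Big(1 - D_K - \frac{D_K}{K-1}\Big)\, l_i l_i^*.$$

We now need $l_i^* M_i^{-1}$. Note that $M_i$ is positive definite and 
one of the eigenvectors of $M_i$ or $M_i^{-1}$ is $l_i$ while the rest 
are orthogonal to $l_i$. Hence, $l_i^* M_i^{-1}$ is equal to $l_i^*$ times the 
eigenvalue of $M_i^{-1}$ corresponding to $l_i$ as the eigenvector. 
Therefore, using $\|l_i\|^2 = 1 - |\hat{h}_i^* \omega_i|^2$, we get 

$$W_i = \frac{\frac{PK}{s}\big(1 - D_K - \frac{D_K}{K-1}\big)(\omega_i^* \hat{h}_i)\, l_i^*}{1 + \frac{P D_K K}{K-1}\big(1 - \frac{1}{s}\big) + \frac{PK}{s}\big(1 - D_K - \frac{D_K}{K-1}\big)|\hat{h}_i^* \omega_i|^2}. \qquad (4)$$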

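E. Analysis of $(1 + \|W_i\|^2)$ 

We have $\|l_i\|^2 = \sum_{j<i} |\hat{h}_i^* \omega_j|^2 = 1 - |\hat{h}_i^* \omega_i|^2$ and 
$|\hat{h}_i^* \omega_i|^2 \to (1 - \bar{i}\bar{s})$ a.s. Let $x^\infty(\bar{i})$ denote the limit of 
$(1 + \|W_i\|^2)$. Then we can easily obtain the following: 

$$x^\infty(\bar{i}) = 1 + \frac{\big(\frac{P}{\bar{s}}\big)^2 (1 - D)^2 (1 - \bar{i}\bar{s})(\bar{i}\bar{s})}{\big(1 + PD + \frac{P}{\bar{s}}(1 - D)(1 - \bar{i}\bar{s})\big)^2}.$$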



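F. Analysis of $\beta := \frac{1}{s}\big|h_i^*(\omega_i + [\omega_1 \cdots \omega_{i-1}] W_i^*)\big|^2$ 

From equation (4) we can write $W_i^* = (\hat{h}_i^* \omega_i)\, c_i\, l_i$ for an 
appropriately chosen scalar $c_i$. Then 

$$\beta = \frac{1}{s} \cdot \frac{\big|h_i^* [\omega_1 \cdots \omega_{i-1}]\, l_i c_i\, |\hat{h}_i^* \omega_i|^2 + h_i^* \omega_i \omega_i^* \hat{h}_i\big|^2}{|\hat{h}_i^* \omega_i|^2}.$$

Let us first analyze the terms $A = h_i^* [\omega_1 \cdots \omega_{i-1}]\, l_i$ and $B = 
h_i^* \omega_i \omega_i^* \hat{h}_i$ in the above equation. To this end, we have 

$$A = h_i^* [\hat{h}_i\ p_i][\hat{h}_i\ p_i]^* [\omega_1 \cdots \omega_{i-1}]\, l_i = h_i^* \hat{h}_i \hat{h}_i^* [\omega_1 \cdots \omega_{i-1}]\, l_i + h_i^* p_i p_i^* [\omega_1 \cdots \omega_{i-1}]\, l_i,$$

$$\text{and } B = h_i^* \hat{h}_i \hat{h}_i^* \omega_i \omega_i^* \hat{h}_i + h_i^* p_i p_i^* \omega_i \omega_i^* \hat{h}_i.$$

Using the arguments developed in Subsection III-A, Part 2), 
it can be proved that the terms $h_i^* p_i p_i^* [\omega_1 \cdots \omega_{i-1}]\, l_i$ and 
$h_i^* p_i p_i^* \omega_i \omega_i^* \hat{h}_i$, after normalization, converge to zero a.s. Hence, we get 

$$\lim_K \frac{1}{s} \cdot \frac{\big|A c_i |\hat{h}_i^* \omega_i|^2 + B\big|^2}{|\hat{h}_i^* \omega_i|^2} = \lim_K \frac{1}{s} \cdot \frac{\big|h_i^* \hat{h}_i \hat{h}_i^* [\omega_1 \cdots \omega_{i-1}]\, l_i c_i |\hat{h}_i^* \omega_i|^2 + h_i^* \hat{h}_i \hat{h}_i^* \omega_i \omega_i^* \hat{h}_i\big|^2}{|\hat{h}_i^* \omega_i|^2}$$

$$= \lim_K \frac{1}{s}\, |h_i^* \hat{h}_i|^2 \cdot |\hat{h}_i^* \omega_i|^4 \cdot \frac{\big(c_i \|l_i\|^2 + 1\big)^2}{|\hat{h}_i^* \omega_i|^2}.$$

Now, $\frac{1}{K}|h_i^* \hat{h}_i|^2 \to (1 - D)$ a.s., $|\hat{h}_i^* \omega_i|^4 \to (1 - \bar{i}\bar{s})^2$ a.s., and 
$\|l_i\|^2 \to (\bar{i}\bar{s})$. The limiting value of $c_i$, denoted by $c^\infty(\bar{i})$, can 
be easily obtained from the results of the previous subsection 
(not shown here explicitly). Putting together these results, we 
obtain $\beta \to \frac{1}{\bar{s}}(1 - D)(1 - \bar{i}\bar{s})\big(c^\infty(\bar{i}) \cdot \bar{i}\bar{s} + 1\big)^2$ a.s. 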

Using all the above results, we obtain the asymptotic 
throughput $\rho(P, \bar{s}, \bar{r})$ as given by equation (3). 

In the simpler case of perfect CSIT (i.e., as $\bar{r} \to \infty$), 
the expression for the asymptotic throughput obtained here 
reduces to $\rho(P, \bar{s}, \infty) = \bar{s} \int_0^1 \log\{1 + \frac{P}{\bar{s}}(1 - \bar{i}\bar{s})\}\, d\bar{i}$, which 
is the same as what one would obtain by specializing the 
perfect-CSIT large-system result mentioned in Section I to the 'on-off'-type power policy. 
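
The numerical integral behind the asymptotic throughput can be evaluated in a few lines. The sketch below is written under the assumption that equation (3) has the form reconstructed from the limits derived in Subsections III-A to III-F (the limit of nr, the limit of $1 + \|W_i\|^2$, and the limit of $\beta$); the function names, the midpoint rule and the sample values of P and of the 'on' fraction are illustrative choices, not taken from the paper.

```python
import math

def asymptotic_throughput(P, s_bar, r_bar, steps=2000):
    """Midpoint-rule evaluation of the asymptotic ZFDPC throughput in bits.

    P     : SNR on a linear scale
    s_bar : limiting fraction of 'on' users, 0 < s_bar <= 1
    r_bar : limiting number of feedback bits per user per antenna
    """
    D = 2.0 ** (-r_bar)              # limiting quantization error
    a = (P / s_bar) * (1.0 - D)
    NR = 1.0 + a + P * D             # limit of nr, independent of the user index
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) / steps * s_bar          # product of i_bar and s_bar
        lam = 1.0 + P * D + a * (1.0 - t)
        x_inf = 1.0 + (a * a) * (1.0 - t) * t / lam ** 2   # limit of 1 + ||W_i||^2
        beta_term = a * (1.0 - t) * (a * t / lam + 1.0) ** 2
        total += math.log2(NR) - math.log2(NR * x_inf - beta_term)
    return s_bar * total / steps

def perfect_csit_throughput(P, s_bar, steps=2000):
    """Midpoint-rule evaluation of the perfect-CSIT expression quoted above."""
    total = 0.0
    for n in range(steps):
        i_bar = (n + 0.5) / steps
        total += math.log2(1.0 + (P / s_bar) * (1.0 - i_bar * s_bar))
    return s_bar * total / steps

P = 10.0  # 10 dB on a linear scale (illustrative)
for r_bar in (1.0, 5.0, 60.0):
    print(r_bar, asymptotic_throughput(P, 0.8, r_bar))
print("perfect CSIT:", perfect_csit_throughput(P, 0.8))
```

With a very large normalized feedback rate the first function should agree with the perfect-CSIT expression, which gives a quick consistency check on the reconstruction.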

IV. Numerical Results 

Let the optimal value of $\bar{s}$ maximizing $\rho$ for the given 
value of P and $\bar{r}$ be $\bar{s}_{opt}$. Let $\rho_{opt}(P, \bar{r}) := \rho(P, \bar{s}_{opt}, \bar{r})$. 
The asymptotic throughput for ZFBF is obtained from [6]. 

In Fig. 1 we plot $\bar{s}_{opt}$ as a function of P for $\bar{r}$ = 1, 5. It 
can be seen that for lower values of P, $\bar{s}_{opt}$ increases with 
P. This behavior can be understood by noting that if the 
increased power is allocated to only a few users, then there 
would be a 'logarithmic' increase in the sum-rate; however, 
if the power is distributed across more users then the 'pre-log' 
factor gets improved (i.e., more users contribute to the 
sum-rate). However, increasing $\bar{s}$ also increases the inter-user 
interference. Hence, at higher values of P, $\bar{s}_{opt}$ becomes 
constant. Next, $\bar{s}_{opt}$ increases with $\bar{r}$ because the inter-user 
interference reduces with increasing $\bar{r}$. Finally, note that $\bar{s}_{opt}$ 
for ZFDPC is higher than that for ZFBF. As we discuss 
below, ZFDPC manages the channel gain of the useful signal 
and the inter-user interference more efficiently than ZFBF and 
hence, $\bar{s}_{opt}$ corresponding to it turns out to be higher. 

Fig. 1. $\bar{s}_{opt}$ vs. P for $\bar{r}$ = 1 and 5. 

Fig. 2. Percentage improvement in the asymptotic throughput achieved with 
ZFDPC over ZFBF. 

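In Fig. 2 we compare the asymptotic throughput obtained 
using ZFDPC and ZFBF. The numerical results in this figure 
pertain to $\rho_{opt}$. Here we plot the percentage improvement 
achieved using ZFDPC over ZFBF against P for three values 
of $\bar{r}$. We can see that ZFDPC achieves a considerably higher 
throughput than ZFBF at all values of P and $\bar{r}$. Note that even 
for $\bar{r}$ as low as 0.5, the percentage improvement is in the range 
of 10% to 20%, which is significant. 

$$\rho(P, \bar{s}, \bar{r}) = \bar{s} \int_0^1 \Bigg\{ \log NR - \log\Bigg[ NR \cdot x^\infty(\bar{i}) - \frac{P}{\bar{s}}(1 - 2^{-\bar{r}})(1 - \bar{i}\bar{s}) \Bigg( \frac{\frac{P}{\bar{s}}(1 - 2^{-\bar{r}})(\bar{i}\bar{s})}{1 + \frac{P}{\bar{s}}(1 - 2^{-\bar{r}})(1 - \bar{i}\bar{s}) + P 2^{-\bar{r}}} + 1 \Bigg)^2 \Bigg] \Bigg\}\, d\bar{i} \qquad (3)$$

where $NR := \lim_K \mathrm{nr} = 1 + \frac{P}{\bar{s}}(1 - 2^{-\bar{r}}) + P 2^{-\bar{r}}$ is independent of $\bar{i}$; and 
$x^\infty(\bar{i}) = 1 + \frac{(\frac{P}{\bar{s}})^2 (1 - 2^{-\bar{r}})^2 (1 - \bar{i}\bar{s})(\bar{i}\bar{s})}{\big(1 + P 2^{-\bar{r}} + \frac{P}{\bar{s}}(1 - 2^{-\bar{r}})(1 - \bar{i}\bar{s})\big)^2}$. 

Fig. 3. Throughput achieved using ZFDPC vs. P for K = 5 and r = 10. 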

Let us now examine the differences between ZFDPC and 
ZFBF. Consider the i-th user. Under ZFDPC, the interference 
due to users 1 to i - 1 is canceled by DPC. This can be 
accomplished for any given choice of the BF vectors of users 
1 to i, as long as the inflation factor $W_i$ is chosen in the 
appropriate manner. The BF vectors are chosen under ZFDPC 
in such a way that the interference due to users i + 1 to 
s gets zeroforced at the i-th user. In contrast to this, under 
ZFBF, the BF vectors are selected so as to zeroforce the 
interference due to all other users. With this background, let us 
now analyze the channel gain of the useful signal, i.e., the term 
$\frac{1}{s}|h_i^* \omega_i|^2$. Under ZFDPC, it is proportional to the $\frac{1}{K}\chi^2_{2(K-i+1)}$ 
RV (recall that the other additive terms converge to zero in the 
limit), whereas under ZFBF, it is proportional to the $\frac{1}{K}\chi^2_{2(K-s+1)}$ 
RV [4], [6]. Thus, except for the user i = s, every other 
user experiences a stronger channel under ZFDPC. In other 
words, DPC (or ZFDPC) manages the channel gain of the 
useful signal and the interference together more effectively 
than ZFBF. It was known that, due to these differences, 
ZFDPC outperforms ZFBF under perfect CSIT [3]. In the 
light of the results obtained here, we conclude that the same 
is true under imperfect CSIT as well. Lastly, it must 
be noted that with ZFBF, the asymptotic throughput is zero 
when $\bar{s} = 1$, i.e., $\rho(P, 1, \bar{r}) = 0\ \forall P, \bar{r}$. Note that for ZFDPC, 
$\rho(P, 1, \bar{r})$ is comparable to $\rho_{opt}(P, \bar{r})$. This behavior can be 
easily understood by noting the distribution of the channel gain 
of the useful signal under the two transmission schemes. 

Consider now Fig. 3. Here, we plot the throughput (i.e., 
the sum-rate normalized by K) for the GBC with K = 5 
and r = 10. We see from the figure that by optimizing over 
the number of users, an improvement of about 0.2 nats can 
be obtained (at all values of P), over the simple solution 
of transmitting to all K users. This improvement is quite 
significant, especially at P = dB and P = 10 dB. The 
question now is how to determine the optimal number of 
'on' users in the K-dimensional GBC for a given P and r. 
Instead of performing an exhaustive search over s, we propose 
a simple and computationally efficient approach. First find 
$\bar{s}_{opt}$ for the given P and $\bar{r} = r/K$. Then we suggest that for 
the K-dimensional GBC, select $s_{opt} := \bar{s}_{opt} K$ (rounded to 
the nearest integer) number of users. In Fig. 3 we see that 
the maximum value of the normalized throughput is indeed 
attained at $s_{opt}$ (rounded to the nearest integer). We have 
observed this simple method to work quite accurately, even for 
relatively small values of K (K = 5 here). Note that the 
method suggested for ZFBF in [6], using their large-system 
analysis for selecting the number of users, is more complicated 
than the one suggested here but seems to provide no particular 
benefit over this simple approach. 
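
The user-selection rule just described can be sketched as a grid search followed by rounding. The code assumes the same reconstructed form of the asymptotic throughput integrand as the rest of this section; the grid resolution, the function names and the choice P = 10 dB are assumptions made for the example, and the paper does not prescribe how the maximization over the 'on' fraction is carried out.

```python
import math

def throughput(P, s_bar, r_bar, steps=400):
    """Midpoint-rule value of the asymptotic ZFDPC throughput (bits)."""
    D = 2.0 ** (-r_bar)
    a = (P / s_bar) * (1.0 - D)
    NR = 1.0 + a + P * D
    acc = 0.0
    for n in range(steps):
        t = (n + 0.5) / steps * s_bar
        lam = 1.0 + P * D + a * (1.0 - t)
        x_inf = 1.0 + (a * a) * (1.0 - t) * t / lam ** 2
        denom = NR * x_inf - a * (1.0 - t) * (a * t / lam + 1.0) ** 2
        acc += math.log2(NR / denom)
    return s_bar * acc / steps

def best_fraction(P, r_bar, grid=200):
    """Grid search for the 'on' fraction that maximizes the throughput."""
    candidates = [k / grid for k in range(1, grid + 1)]
    return max(candidates, key=lambda s_bar: throughput(P, s_bar, r_bar))

def users_to_select(P, r, K):
    """Map the asymptotic optimum to a K-dimensional system by rounding."""
    s_bar_opt = best_fraction(P, r / K)
    return max(1, min(K, round(s_bar_opt * K)))

P_dB, r, K = 10.0, 10, 5        # K and r as in Fig. 3; P is illustrative
P = 10.0 ** (P_dB / 10.0)
n_on = users_to_select(P, r, K)
print(f"suggested number of 'on' users for K = {K}: {n_on}")
```

Because the search is over a single scalar and each evaluation is a one-dimensional integral, the cost does not grow with K, unlike an exhaustive search over simulated finite-dimensional systems.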

V. Conclusion 

We provide a large-system analysis of the GBC with finite- 
rate feedback and derive a closed-form expression for the 
asymptotic throughput achievable using ZFDPC. Using this 
result, we show that the DPC-based scheme achieves a 
significantly higher throughput than ZFBF. For the first time, DPC 
is shown to have a better performance under imperfect CSIT. 
Also, using the asymptotic throughput expression, we address 
the problem of optimizing over the number of 'on' users. 